1 - Spatial Data Science

If you think that spatial data is just a set of longitude and latitude observations for a given area of geographic space, and that it should be treated as such regardless, you’re missing out on opportunities and have a high chances of incurring in meaningless analysis. Spatial data is not, in fact, just a set of records of observable events and/or facts in geographical space. They are the (potential) starting point for answering relevant questions about various problems in a given reality.

In the New Era of Big Data, where decision-making is increasingly data-driven, Spatial Data Science has emerged as an essential field, with applications spanning many sectors of society (e.g. agriculture, environmental science, urban planning, clean energy generation, sanitation, water management, and much more.

For these and many other reasons, I challenge you to reflect on your convictions about Spatial Data, this important entity in the context of Big Data. A lot of things has changed, is changing and will change faster and faster, especially due to the advance of visual and computational technologies. No matter what your area of interest in this field, these changes presuppose, and to some extent demand, that you develop new skills.

2 -R for Spatial Data Science

At some point in your life you’ve heard someone say that “R is a powerful tool”. This is true, and you don’t need to make a big effort to find n reasons why it is. WWhen I wrote this notebook (using R Markdown), I didn’t do it thinking about explaining what R is and how it works. The purpose of this notebook goes beyond this part, which I assume you already know. After all, my goal with this notebook is to demonstrate, with one possibility, how you can use R to start your studies in Spatial Data Science.

3 - The spatial data in the context of this notebook

In this notebook I start using spatial data on the occurrence (point geometries) of land degraded by erosion processes in natural and antropogenic ecosystems in a watershed located in the Brazilian Semiarid Region. Specifically, the data was collected and organized by me, after successive efforts in various field works in the state of Bahia, between the year 2022 and 2024.

Land degradation is understood as the reduction of soil capacity to generate assets and services, in qualitative and quantitative terms, due to the decline of its productive potential and of its environmental regulation capacity” (LAL, 2012). From this perspective, land degradation is a complex process, associated with natural factors and human activities. According to Silva (2023), the land degradation is closely related to landscape degradation, considering the soil-landscape relationship.

On figure below I present a record (21-06-2023/17:06pm), organized by me, of an area highly degraded by an erosive process (Ravina), with a strong impact in landscape.

4 - Hands on

4.1 - Interactive Maps using point data

To reproduce this document, you need to have R installed on your machine, as well RStudio. RStudio is an Integrated Development Environment (IDE) for R.

In the chunk below you will set your Working Directory. To do this you use setwd( ) and pass the Pass the address.

setwd("D:/Documents/Music/PROJECTSR/SpatialDataScienceinR")

In the chunk below you will install the necessary packages. As I already have these packages installed, note that I left the command as a comment, adding the #. If you don’t have these packages installed, just remove the # and run. To install a package you use install.packages( ).

The first package to install is leafleat, a visualization tool that is highly recommended for those who want Interactive Maps.

The second package to install is sf (Simple Features) package, a second very useful feature for those who work with maps, very interesting for representing and working with spatial vector data including points, polygons, and lines.

#install.packages("leafleat")
#install.packages("sf")

Once installed, in the chunk below you will load the installed packages. All you have to do is use library( ).

library(leaflet)
library(sf)

Here, in the chunk below, you will read the spatial data, in this case, a point geometries (shp.). Note that it will be stored in a variable named shape.

shape <- st_read('D:/Documents/Music/PROJECTSR/SpatialDataScienceinR/SpatialData/LandDegradation.shp')
## Reading layer `LandDegradation' from data source 
##   `D:\Documents\Music\PROJECTSR\SpatialDataScienceinR\SpatialData\LandDegradation.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 2146 features and 12 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -40.69852 ymin: -11.37433 xmax: -40.67941 ymax: -11.34879
## Geodetic CRS:  WGS 84

In the chunk below you can crete your first interactive map using the leaflet package. In this situation, your interactive map show the information of land degradation ocurrence. Note that, the infomation is presented by cluster. If you zoom in, you’ll note that the clusters gradually disappear. This is a very interesting approach when you have a large number of points that can visually pollute the map.

  leaflet(shape) %>%
  setView(lat = -11.351051, lng = -40.684630, zoom = 16) %>% 
  addProviderTiles(providers$Esri.WorldImagery, group = "Esri.WorldImagery") %>%
  addProviderTiles(providers$OpenStreetMap, group = "Open Street Map") %>%
  addLayersControl(baseGroups = c("ESRI World Imagery", "Open Street Map"),
  options = layersControlOptions(collapsed = TRUE)) %>%
  addMarkers(clusterOptions = markerClusterOptions(),
             label = shape$Tipo,
             labelOptions = labelOptions(
             textsize = "14px", 
             style = list(
               "font-weight" = "bold", 
             padding = "5px", 
             lat = ~Lat, lng = ~Long)))

In the chunk below you can create your second interactive map using the leaflet package. Note that this time the information is categorized by the Tipo field (Type of Erosion), and is also associated with colors (yello, orange and red) that reflect the severity of the process.

Tipo <- c("Boçoroca", "Ravina", "Sulcos")

my_palette <- c("red", "orange", "yellow")

#previewColors(colorFactor(my_palette, levels = Tipo), Tipo)

factpal <- colorFactor(my_palette, levels = Tipo)

groups = unique(shape$Tipo)

map = leaflet(shape) %>% 
  setView(lat = -11.351051, lng = -40.684630, zoom = 17.3) %>% 
  addProviderTiles(providers$Esri.WorldImagery, group = "Esri.WorldImagery")
  for (i in groups) {
    data = shape[shape$Tipo == i, ]
    map = map %>% addCircleMarkers(data = data, radius = 4, 
        weight = 2, opacity = 1, fill = TRUE, fillOpacity = 0.2, color = ~factpal(Tipo), group = i)}
map %>% addLayersControl(overlayGroups = groups, options = layersControlOptions(collapsed = FALSE))

4.2 - Interactive Maps using vector data

4.3 - Interactive Maps using raster data

5 - Google Earth Engine

6 - Brazil Data Cube


  1. If you like it and want to contribute: pix: izaiasdesouzasilvaa@gmail.com↩︎